SUMMARY:
A PROJECT USING A SECTION OF THE KAGGLE DOGS VS. CATS DATA SET TO EXPLORE THE IMPACT OF VARIOUS TRAINING VARIABLES ON MODEL TRAINING AND OVERALL MODEL ACCURACY

TABLE OF CONTENTS

Part 1: Load and inspect Pre-trained Convolutional Neural Network (CNN)

Part 2: Set up image data

Part 3: Transfer Learning

Part 4: Data augmentation

Part 5: Repeat steps 1 to 4 with exploration of model training variables

--Part 5.1: Train the network-classifier for a longer time (i.e., increase the number of epochs)

--Part 5.2: Train the network-classifier with a larger number of images

--Part 5.3: Try different (%) values for partitioning the dataset into training and validation sets

--Part 5.4: Choose different hyperparameters (optimizer, learning rate, etc.) for network training

--Part 5.5: Try different image data augmentation options (and values/ranges)

--Part 5.6: Use a different pretrained model

Functions

Part 1: Load and inspect Pre-trained Convolutional Neural Network (CNN)

Return to table of contents

1.1: Loading a pre-trained "AlexNet"

% Ensure that you have downloaded and installed the
% "Deep Learning Toolbox Model for AlexNet Network" support package.
 
% See https://www.mathworks.com/matlabcentral/fileexchange/59133-deep-learning-toolbox-model-for-alexnet-network
% and
% https://www.mathworks.com/help/deeplearning/ug/pretrained-convolutional-neural-networks.html
% for additional information.
 
model1 = alexnet
model1 =
SeriesNetwork with properties:

         Layers: [25×1 nnet.cnn.layer.Layer]
     InputNames: {'data'}
    OutputNames: {'output'}

1.2: Inspect the CNN's layers

model1.Layers
ans =
25×1 Layer array with layers:

     1   'data'     Image Input                   227×227×3 images with 'zerocenter' normalization
     2   'conv1'    2-D Convolution               96 11×11×3 convolutions with stride [4 4] and padding [0 0 0 0]
     3   'relu1'    ReLU                          ReLU
     4   'norm1'    Cross Channel Normalization   cross channel normalization with 5 channels per element
     5   'pool1'    2-D Max Pooling               3×3 max pooling with stride [2 2] and padding [0 0 0 0]
     6   'conv2'    2-D Grouped Convolution       2 groups of 128 5×5×48 convolutions with stride [1 1] and padding [2 2 2 2]
     7   'relu2'    ReLU                          ReLU
     8   'norm2'    Cross Channel Normalization   cross channel normalization with 5 channels per element
     9   'pool2'    2-D Max Pooling               3×3 max pooling with stride [2 2] and padding [0 0 0 0]
    10   'conv3'    2-D Convolution               384 3×3×256 convolutions with stride [1 1] and padding [1 1 1 1]
    11   'relu3'    ReLU                          ReLU
    12   'conv4'    2-D Grouped Convolution       2 groups of 192 3×3×192 convolutions with stride [1 1] and padding [1 1 1 1]
    13   'relu4'    ReLU                          ReLU
    14   'conv5'    2-D Grouped Convolution       2 groups of 128 3×3×192 convolutions with stride [1 1] and padding [1 1 1 1]
    15   'relu5'    ReLU                          ReLU
    16   'pool5'    2-D Max Pooling               3×3 max pooling with stride [2 2] and padding [0 0 0 0]
    17   'fc6'      Fully Connected               4096 fully connected layer
    18   'relu6'    ReLU                          ReLU
    19   'drop6'    Dropout                       50% dropout
    20   'fc7'      Fully Connected               4096 fully connected layer
    21   'relu7'    ReLU                          ReLU
    22   'drop7'    Dropout                       50% dropout
    23   'fc8'      Fully Connected               1000 fully connected layer
    24   'prob'     Softmax                       softmax
    25   'output'   Classification Output         crossentropyex with 'tench' and 999 other classes
 
% The intermediate layers make up the bulk of the CNN. These are a series
% of convolutional layers, interspersed with rectified linear units (ReLU)
% and max-pooling layers [2]. Following these layers are 3
% fully-connected layers.
 
% The final layer is the classification layer and its properties depend on
% the classification task. In this example, the CNN model that was loaded
% was trained to solve a 1000-way classification problem. Thus the
% classification layer has 1000 classes from the ImageNet dataset.
 
% Inspect the last layer
model1.Layers(end)
ans =
ClassificationOutputLayer with properties:

            Name: 'output'
         Classes: [1000×1 categorical]
    ClassWeights: 'none'
      OutputSize: 1000

   Hyperparameters
    LossFunction: 'crossentropyex'
 
% Number of class names for ImageNet classification task
numel(model1.Layers(end).ClassNames)
ans = 1000
 
% Note that the CNN model is not going to be used for the original
% classification task. It is going to be re-purposed to solve a different
% classification task on the pets dataset.
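Before re-purposing the network, it can be useful to sanity-check the original 1000-class head on an arbitrary image. A minimal sketch, using MATLAB's built-in sample image 'peppers.png' (any RGB image resized to 227×227 would do):

```matlab
% Classify one image with the original ImageNet classifier (sketch)
img = imread('peppers.png');          % built-in MATLAB sample image
img = imresize(img, [227 227]);       % match AlexNet's expected input size
[label, scores] = classify(model1, img);
disp(label)                           % predicted ImageNet class
disp(max(scores))                     % confidence of the top class
```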

Part 2: Set up image data

Return to table of contents

2.1: Load simplified dataset and build image store

dataFolder = './data/PetImages'; % Defines the path to the directory where the image data is stored
categories = {'cat', 'dog'}; % Creates a cell array containing the category names. In this case, it specifies two categories: 'cat' and 'dog'
% Creates an imageDatastore object by specifying the folder names containing the image data (dataFolder) and the category labels for each folder.
% Each folder name corresponds to a category label.
% The 'LabelSource' parameter is set to 'foldernames', indicating that the category labels are derived from the names of the folders
imds = imageDatastore(fullfile(dataFolder, categories), LabelSource = 'foldernames');
% Counts the number of images in each category using the countEachLabel function applied to the imageDatastore object
tbl = countEachLabel(imds);
% Displays the table containing the count of images in each category.
disp (tbl)
    Label    Count
    _____    _____

     cat      20
     dog      20
% Ensure that each class (category) has the same number of images by trimming the dataset to the size of the smallest class
% (useful when the two classes have different number of elements but not needed in this case)
minSetCount = min(tbl{:,2}); % Calculate the minimum number of images among all classes
 
% split the image datastore (imds) into two new datastores,
% each containing the same number of images as specified by minSetCount.
% This function ensures that each class has an equal number of random images.
imds = splitEachLabel(imds, minSetCount, 'randomize');
 
% Notice that each set now has exactly the same number of images.
countEachLabel(imds)
ans = 2×2 table
        Label    Count
    1   cat       20
    2   dog       20

2.2: Pre-process Images For CNN

AlexNet can only process RGB images that are 227-by-227. To avoid re-saving all the images in this format, set up the imds read function, imds.ReadFcn, to pre-process images on the fly. The imds.ReadFcn is called every time an image is read from the ImageDatastore.
Set the ImageDatastore ReadFcn
image_size = 227;
imds.ReadFcn = @(filename)readAndPreprocessImage(filename, image_size);
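readAndPreprocessImage is defined in the Functions section (not shown in this excerpt). A minimal sketch of what such a helper typically does, assuming it only resizes and fixes the channel count:

```matlab
function Iout = readAndPreprocessImage(filename, image_size)
    % Read the image from disk
    I = imread(filename);
    % Some Kaggle images are grayscale; replicate to 3 channels for the CNN
    if ismatrix(I)
        I = cat(3, I, I, I);
    end
    % Resize to the square input size expected by the network
    Iout = imresize(I, [image_size image_size]);
end
```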

2.3: Divide data into training and validation sets

% split data into 70 % for training set, 30 % for validation set
[trainingSet, validationSet] = splitEachLabel(imds, 0.7, 'randomized');
 
% display number of training set images
countEachLabel(trainingSet)
ans = 2×2 table
        Label    Count
    1   cat       14
    2   dog       14
% display number of validation set images
countEachLabel(validationSet)
ans = 2×2 table
        Label    Count
    1   cat        6
    2   dog        6

Part 3: Transfer Learning

Return to table of contents
% The convolutional layers of the network extract image features that the
% last learnable layer and the final classification layer use to classify
% the input image.
 
% To retrain a pretrained network to classify new images, we must replace these
% last layers with new layers adapted to the new data set.

3.1: Freeze all but last three layers

layersTransfer = model1.Layers(1:end-3); % Capture as "layersTransfer" all but last three layers of original pretrained network
numClasses = 2; % cat and dog
 
% Define new set of layers for the transfer learning model.
% Start with the layers from the pretrained network (layersTransfer), then append new layers:
layers = [
layersTransfer
% Fully connected layer with numClasses output neurons with specified
% weights and biases
fullyConnectedLayer(numClasses,'WeightLearnRateFactor',20,'BiasLearnRateFactor',20)
% Softmax activation function layer that produces class probabilities
softmaxLayer
% layer defines the final classification output and computes the cross-entropy loss
% between the predicted probabilities and the true labels.
classificationLayer];

3.2: Configure training options

% define options to train model
% Start MaxEpochs at 5
% Use UN-augmented training images
% Use "layers" modified pretrained alexnet network
options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',5, ...
'InitialLearnRate',1e-4, ...
'Shuffle','every-epoch', ...
'ValidationData',validationSet, ...
'ValidationFrequency',3, ...
'Verbose',false, ...
'Plots','training-progress');
% Verify dimensions of images in imageDatastore
% Read a single image from the imageDatastore
img = readimage(imds, 1);
% Get the dimensions of the image
dimensions = size(img);
% Display the dimensions of the image
fprintf('Image dimensions: [%d, %d, %d]\n', dimensions(1), dimensions(2), dimensions(3));
Image dimensions: [227, 227, 3]

3.3: Retrain network

model2 = trainNetwork(trainingSet,layers,options);
 
% modelTransfer = trainNetwork(trainingSet, layerGraph(model),options); % for squeezenet

3.4: Classify the validation images using the fine-tuned network.

% The YPred variable holds the predicted labels for the images in the validationSet.
% The scores variable contains the scores associated with each class for each image in the validationSet.
% It is a matrix where each row corresponds to an image in the validationSet, and each column corresponds to a class.
% The values in scores represent the confidence or probability of each class for each image.
[YPred,scores] = classify(model2,validationSet);

3.5: Calculate the classification accuracy on the validation set.

Accuracy is the fraction of labels that the network predicts correctly.
YValidation = validationSet.Labels; % determine actual validation image set labels
% Compute fraction of correctly predicted labels
% by comparing predicted versus actual validation image set labels
accuracy = mean(YPred == YValidation);
fprintf("The validation accuracy is: %.2f %%\n", accuracy * 100);
The validation accuracy is: 100.00 %
-- plot confusion matrix
% Define class labels
classNames = unique([YValidation; YPred]);
% Compute the confusion matrix
C = confusionmat(YValidation, YPred, 'Order', classNames);
% Visualize the confusion matrix as a heatmap
figure;
heatmap(classNames, classNames, C);
title('Confusion Matrix');
xlabel('Predicted Labels');
ylabel('True Labels');
-- inspect incorrect classifications
% Find indices where the true labels and predicted labels do not match
incorrect_indices = find(YValidation ~= YPred);
if isempty(incorrect_indices)
disp("No images misclassified")
 
else
% Display some of the incorrectly classified images
numImagesToShow = min(10, numel(incorrect_indices)); % Show at most 10 images
for i = 1:numImagesToShow
subplot(2, 5, i);
idx = incorrect_indices(i);
% Extract the file path from the cell array
file_path_cell = validationSet.Files(idx);
% Convert the cell array into a string
file_path_string = char(file_path_cell);
% Show image with true and predicted labels
img = imread(file_path_string);
img_resized = imresize(img, [400, 600]);
imshow(img_resized)
title(sprintf('True: %s, \n Predicted: %s', string(YValidation(idx)), string(YPred(idx))));
end
end

3.6: Test it on unseen images

newImage1 = './dog.jpg'; % any dog image should do!
img1 = readAndPreprocessImage(newImage1, 227); % read and preprocess the image stored at the specified path
YPred1 = predict(model2,img1); % predict the label probabilities for the preprocessed image using trained model
[confidence1,idx1] = max(YPred1);% maximum predicted probability and its confidence level
label1 = categories{idx1}; % retrieve the label associated with the maximum probability index from the categories array
% Display test image and assigned label
figure
imshow(img1)
title(string(label1) + ", " + num2str(100*confidence1) + "%");
 
newImage2 = './cat.jpg'; % any cat image should do!
img2 = readAndPreprocessImage(newImage2, 227);
YPred2 = predict(model2,img2);
[confidence2,idx2] = max(YPred2);
label2 = categories{idx2};
% Display test image and assigned label
figure
imshow(img2)
title(string(label2) + ", " + num2str(100*confidence2) + "%");

3.7: Test it on unseen images: Your turn!

% What about the iconic "Doge"?
% ENTER YOUR CODE HERE
newImage3 = './doge.jpg'; % any dog image should do!
img3 = readAndPreprocessImage(newImage3, 227);
YPred3 = predict(model2,img3);
[confidence3,idx3] = max(YPred3);
label3 = categories{idx3};
% Display test image and assigned label
figure
imshow(img3)
title(string(label3) + ", " + num2str(100*confidence3) + "%");
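The three nearly identical test blocks above could be factored into a small helper. This is a hypothetical refactor (classifyAndShow is not part of the original script); it assumes the readAndPreprocessImage helper and the categories cell array defined earlier:

```matlab
function classifyAndShow(model, imagePath, categories, image_size)
    % Read, preprocess, classify, and display a single image
    img = readAndPreprocessImage(imagePath, image_size);
    scores = predict(model, img);          % class probabilities
    [confidence, idx] = max(scores);       % top class and its confidence
    figure
    imshow(img)
    title(string(categories{idx}) + ", " + num2str(100*confidence) + "%");
end
```

Usage would then be, e.g., `classifyAndShow(model2, './dog.jpg', categories, 227)`.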

Part 4: Data augmentation

Return to table of contents
% Data augmentation helps prevent the network from overfitting and
% memorizing the exact details of the training images.
 
% In MATLAB, this can be done using the "Augmented Image Datastore"
% (https://www.mathworks.com/help/deeplearning/ref/augmentedimagedatastore.html)
 
% Despite its name, however, it DOES NOT increase the actual number of
% samples. When you use an augmented image datastore as a source of
% training images, the datastore randomly perturbs the training data for
% each epoch, so that each epoch uses a slightly different data set. The
% actual number of training images at each epoch does not change. The
% transformed images are not stored in memory.

4.1: Defining the imageAugmenter object

In our case, we shall use an augmented image datastore to randomly flip the training images along the vertical axis, randomly translate them by up to 30 pixels, and randomly scale them by up to 10% horizontally and vertically.
pixelRange = [-30 30];
scaleRange = [0.9 1.1];
imageAugmenter = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange, ...
'RandYTranslation',pixelRange, ...
'RandXScale',scaleRange, ...
'RandYScale',scaleRange);

4.2: Building the augmented training and validation sets

% retrieve the input size of the first layer of the pretrained model
inputSize = model1.Layers(1).InputSize;
 
% create an augmented image datastore, augimdsTrain, using the training set (trainingSet) and the input size of the model.
augimdsTrain = augmentedImageDatastore(inputSize(1:2),trainingSet, DataAugmentation = imageAugmenter);
disp(augimdsTrain.NumObservations) % You should see 28
28
 
% create an augmented image datastore, augimdsValidation, using the validation set (validationSet) and the input size of the model.
augimdsValidation = augmentedImageDatastore(inputSize(1:2),validationSet);
disp(augimdsValidation.NumObservations) % You should see 12
12
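To see what the augmenter actually feeds the network, one mini-batch can be read from the augmented datastore and tiled. A sketch, assuming the default behavior where read returns a table whose input column holds the images:

```matlab
% Preview one batch of randomly augmented training images (sketch)
minibatch = read(augimdsTrain);      % table with 'input' and 'response' columns
figure;
imshow(imtile(minibatch.input));     % tile the batch of augmented images
reset(augimdsTrain);                 % rewind the datastore before training
```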

4.3: Train the network with augmented datasets

% Increase MaxEpochs from 5 to 8
% Use augmented training images
% Use established "layers" modified pretrained alexnet network
miniBatchSize = 10;
options = trainingOptions('sgdm', ...
'MiniBatchSize',miniBatchSize, ...
'MaxEpochs',8, ...
'InitialLearnRate',3e-4, ...
'ValidationData',augimdsValidation, ...
'Verbose',false, ...
'Plots','training-progress');
 
model3 = trainNetwork(augimdsTrain,layers,options);

4.4: Classify the augmented validation images using the fine-tuned network.

% The YPredAug variable holds the predicted labels for the images in the augmented validationSet.
% The probsAug variable contains the scores associated with each class for each image in the augmented validationSet.
% It is a matrix where each row corresponds to an image in the augmented validationSet, and each column corresponds to a class.
% The values in scores represent the confidence or probability of each class for each image.
[YPredAug,probsAug] = classify(model3,augimdsValidation);

4.5: Calculate the classification accuracy on the validation set.

Accuracy is the fraction of labels that the network predicts correctly.
YValidationAug = validationSet.Labels; % determine actual validation image set labels
% Compute fraction of correctly predicted labels
% by comparing predicted versus actual augmented validation image set labels
accuracyAug = mean(YPredAug == YValidationAug);
fprintf("The validation accuracy is: %.2f %%\n", accuracyAug * 100);
The validation accuracy is: 100.00 %
-- plot confusion matrix
% Define class labels
classNames = unique([YValidationAug; YPredAug]);
% Compute the confusion matrix
C = confusionmat(YValidationAug, YPredAug, 'Order', classNames);
% Visualize the confusion matrix as a heatmap
figure;
heatmap(classNames, classNames, C);
title('Confusion Matrix');
xlabel('Predicted Labels');
ylabel('True Labels');
-- inspect incorrect classifications
% Find indices where the true labels and predicted labels do not match
incorrect_indices = find(YValidationAug ~= YPredAug);
if isempty(incorrect_indices)
disp("No images misclassified")
 
else
% Display some of the incorrectly classified images
numImagesToShow = min(10, numel(incorrect_indices)); % Show at most 10 images
for i = 1:numImagesToShow
subplot(2, 5, i);
idx = incorrect_indices(i);
% Extract the file path from the cell array
file_path_cell = augimdsValidation.Files(idx);
% Convert the cell array into a string
file_path_string = char(file_path_cell);
% Show image with true and predicted labels
img = imread(file_path_string);
img_resized = imresize(img, [400, 600]);
imshow(img_resized)
title(sprintf('True: %s, \n Predicted: %s', string(YValidationAug(idx)), string(YPredAug(idx))));
end
end
No images misclassified

4.6: Test it on unseen images

newImage1 = './dog.jpg'; % any dog image should do!
img1 = readAndPreprocessImage(newImage1, 227); % read and preprocess the image stored at the specified path
YPred1 = predict(model3,img1); % predict the label probabilities for the preprocessed image using trained model
[confidence1,idx1] = max(YPred1);% maximum predicted probability and its confidence level
label1 = categories{idx1}; % retrieve the label associated with the maximum probability index from the categories array
% Display test image and assigned label
figure
imshow(img1)
title(string(label1) + ", " + num2str(100*confidence1) + "%");
newImage2 = './cat.jpg'; % any cat image should do!
img2 = readAndPreprocessImage(newImage2, 227);
YPred2 = predict(model3,img2);
[confidence2,idx2] = max(YPred2);
label2 = categories{idx2};
% Display test image and assigned label
figure
imshow(img2)
title(string(label2) + ", " + num2str(100*confidence2) + "%");
% What about the iconic "Doge"?
% ENTER YOUR CODE HERE
newImage3 = './doge.jpg'; % any dog image should do!
img3 = readAndPreprocessImage(newImage3, 227);
YPred3 = predict(model3,img3);
[confidence3,idx3] = max(YPred3);
label3 = categories{idx3};
% Display test image and assigned label
figure
imshow(img3)
title(string(label3) + ", " + num2str(100*confidence3) + "%");
-- Conclusions/Lessons/Insights:
Using the augmented training set and augmented validation set did not noticeably improve model performance in this case. This may be because the model is simple and the data set is small.

Part 5: Repeat steps 1 to 4 with exploration of model training variables

Return to table of contents

Load small data set

% Path to small data set
dataFolder_small = './data/PetImages';
% Category names of small data set
categories = {'cat', 'dog'};
% Creates an imageDatastore object holding small data set
imds_small = imageDatastore(fullfile(dataFolder_small, categories), LabelSource = 'foldernames');
% Counts the number of images in each category of imageDatastore object
tbl_small = countEachLabel(imds_small);
% Display images in each category of imageDatastore object as a table
disp (tbl_small)
    Label    Count
    _____    _____

     cat      20
     dog      20
% Calculate the minimum number of images among all classes
minSetCount_small = min(tbl_small{:,2});
% Split the image datastore (imds) into two new datastores,
% each containing the same number of images as specified by minSetCount.
% This function ensures that each class has an equal number of random images.
imds_small = splitEachLabel(imds_small, minSetCount_small, 'randomize');
% Verify each set now has exactly the same number of images.
countEachLabel(imds_small)
ans = 2×2 table
        Label    Count
    1   cat       20
    2   dog       20
 
% External function to resize all images to 227 x 227 RGB in small data set for Alexnet
image_size = 227;
imds_small.ReadFcn = @(filename)readAndPreprocessImage(filename, image_size);

Divide small data set into training, validation, and test sets

% split data into 70 % for training set, 15 % for validation set, 15 % for test set
[trainingSet_small, remainingSet_small] = splitEachLabel(imds_small, 0.7, 'randomized');
[testSet_small, validationSet_small] = splitEachLabel(remainingSet_small, 0.5, 'randomized');
% display number of training set images
countEachLabel(trainingSet_small)
ans = 2×2 table
        Label    Count
    1   cat       14
    2   dog       14
% display number of validation set images
countEachLabel(validationSet_small)
ans = 2×2 table
        Label    Count
    1   cat        3
    2   dog        3
% display number of test set images
countEachLabel(testSet_small)
ans = 2×2 table
        Label    Count
    1   cat        3
    2   dog        3

Augment small training and validation sets

% Data augmentation
pixelRange = [-30 30];
scaleRange = [0.9 1.1];
imageAugmenter = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange, ...
'RandYTranslation',pixelRange, ...
'RandXScale',scaleRange, ...
'RandYScale',scaleRange);
 
% Define augmented training and validation data sets
% retrieve the input size of the first layer of the pretrained model
inputSize = model1.Layers(1).InputSize;
 
% Create an augmented image datastore using the training set and the input size of the model.
augimdsTrain_small = augmentedImageDatastore(inputSize(1:2),trainingSet_small, DataAugmentation = imageAugmenter);
disp(augimdsTrain_small.NumObservations)
28
 
% Create an augmented image datastore using the validation set and the input size of the model.
augimdsValidation_small = augmentedImageDatastore(inputSize(1:2),validationSet_small);
disp(augimdsValidation_small.NumObservations)
6
% Create an augmented image datastore using the test set and the input size of the model.
augimdsTest_small = augmentedImageDatastore(inputSize(1:2),testSet_small);
disp(augimdsTest_small.NumObservations)
6

Load large data set

% The dogs-vs-cats Kaggle data set was downloaded and found to contain 12500
% images for the dog class and 12500 images for the cat class.
% For brevity, only 500 images were used from each class as shown below.
% Path to large data set
dataFolder_larger = './larger_data/larger_PetImages';
% Category names of large data set
categories = {'cat', 'dog'};
% Creates an imageDatastore object holding large data set
imds_large = imageDatastore(fullfile(dataFolder_larger, categories), LabelSource = 'foldernames');
% Counts the number of images in each category of imageDatastore object
tbl_large = countEachLabel(imds_large);
% Display images in each category of imageDatastore object as a table
disp (tbl_large)
    Label    Count
    _____    _____

     cat      500
     dog      500
% Calculate the minimum number of images among all classes
minSetCount_large = min(tbl_large{:,2});
% Split the image datastore (imds) into two new datastores,
% each containing the same number of images as specified by minSetCount.
% This function ensures that each class has an equal number of random images.
imds_large = splitEachLabel(imds_large, minSetCount_large, 'randomize');
% Verify each set now has exactly the same number of images.
countEachLabel(imds_large)
ans = 2×2 table
        Label    Count
    1   cat      500
    2   dog      500
 
% External function to resize all images to 227 x 227 RGB in large data set for Alexnet
image_size = 227;
imds_large.ReadFcn = @(filename)readAndPreprocessImage(filename, image_size);

Divide large data set into training, validation, and test sets

% split data into 70 % for training set, 15 % for validation set, 15 % for test set
[trainingSet_large, remainingSet_large] = splitEachLabel(imds_large, 0.7, 'randomized');
[testSet_large, validationSet_large] = splitEachLabel(remainingSet_large, 0.5, 'randomized');
% display number of training set images
countEachLabel(trainingSet_large)
ans = 2×2 table
        Label    Count
    1   cat      350
    2   dog      350
% display number of validation set images
countEachLabel(validationSet_large)
ans = 2×2 table
        Label    Count
    1   cat       75
    2   dog       75
% display number of test set images
countEachLabel(testSet_large)
ans = 2×2 table
        Label    Count
    1   cat       75
    2   dog       75

Augment large training and validation sets

% Data augmentation
pixelRange = [-30 30];
scaleRange = [0.9 1.1];
imageAugmenter = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange, ...
'RandYTranslation',pixelRange, ...
'RandXScale',scaleRange, ...
'RandYScale',scaleRange);
 
% Define augmented training and validation data sets
% retrieve the input size of the first layer of the pretrained model
inputSize = model1.Layers(1).InputSize;
 
% Create an augmented image datastore using the training set and the input size of the model.
augimdsTrain_large = augmentedImageDatastore(inputSize(1:2),trainingSet_large, DataAugmentation = imageAugmenter);
disp(augimdsTrain_large.NumObservations)
700
% Create an augmented image datastore using the validation set and the input size of the model.
augimdsValidation_large = augmentedImageDatastore(inputSize(1:2),validationSet_large);
disp(augimdsValidation_large.NumObservations)
150
% Create an augmented image datastore using the test set and the input size of the model.
augimdsTest_large = augmentedImageDatastore(inputSize(1:2),testSet_large);
disp(augimdsTest_large.NumObservations)
150

Part 5.1: Train the network-classifier for a longer time (i.e., increase the number of epochs)

Return to table of contents
% Variable To Be Explored: Investigate increasing MaxEpochs from 5 to 10 to 20
% Variable To Be Explored: Use small, UN-augmented data set
% All other variables held constant
% Use established "layers" modified pretrained alexnet network
 
epoch_model_small_data_accuracies = [];
epoch_models = ["5-epoch", "10-epoch", "20-epoch"];
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',5, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_small, ...
'Verbose',false, ...
'Plots','training-progress');
 
five_epoch_accuracy = model_optimization_explorer (trainingSet_small, validationSet_small, testSet_small, model_options, layers);
The validation accuracy is: 100.00 %
No images misclassified
 
epoch_model_small_data_accuracies = [epoch_model_small_data_accuracies, five_epoch_accuracy]; % Capture accuracy value
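model_optimization_explorer is defined in the Functions section (not included in this excerpt). Based on how it is called and what it prints, a minimal sketch of its likely behavior (the actual function may also evaluate the test set and plot misclassifications, which is omitted here):

```matlab
function accuracy = model_optimization_explorer(trainingSet, validationSet, testSet, model_options, layers)
    % Train the network with the supplied options (sketch only)
    net = trainNetwork(trainingSet, layers, model_options);
    % Classify the validation images and compute accuracy
    YPred = classify(net, validationSet);
    YValidation = validationSet.Labels;
    accuracy = mean(YPred == YValidation);
    fprintf("The validation accuracy is: %.2f %%\n", accuracy * 100);
    % Report when no images were misclassified
    if ~any(YPred ~= YValidation)
        disp("No images misclassified")
    end
end
```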
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_small, ...
'Verbose',false, ...
'Plots','training-progress');
 
ten_epoch_accuracy = model_optimization_explorer (trainingSet_small, validationSet_small, testSet_small, model_options, layers);
The validation accuracy is: 100.00 %
No images misclassified
 
epoch_model_small_data_accuracies = [epoch_model_small_data_accuracies, ten_epoch_accuracy]; % Capture accuracy value
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',20, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_small, ...
'Verbose',false, ...
'Plots','training-progress');
 
twenty_epoch_accuracy = model_optimization_explorer (trainingSet_small, validationSet_small, testSet_small, model_options, layers);
The validation accuracy is: 83.33 %
 
epoch_model_small_data_accuracies = [epoch_model_small_data_accuracies twenty_epoch_accuracy]; % Capture accuracy value
 
disp(epoch_models)
"5-epoch" "10-epoch" "20-epoch"
disp(epoch_model_small_data_accuracies)
1.0000 1.0000 0.8333
 
% Convert arrays to a table
dataframe_epochs_small_data = table(epoch_models', epoch_model_small_data_accuracies', 'VariableNames', {'model', 'accuracy (small data)'});
 
% Display the dataframe
disp(dataframe_epochs_small_data);
       model        accuracy (small data)
    ____________    _____________________

    "5-epoch"               1
    "10-epoch"              1
    "20-epoch"        0.83333
-- Conclusions/Lessons/Insights:
The number of training epochs does matter. Around 10 epochs appear to be sufficient; accuracy dropped when training for 20 epochs on this small data set.
Strategy => Consider no more than 10 training epochs going forward.
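The accuracies collected above can also be compared visually. An optional sketch using the arrays already defined in this part:

```matlab
% Bar chart of validation accuracy vs. number of training epochs
figure;
bar(categorical(epoch_models), epoch_model_small_data_accuracies);
ylabel('Validation accuracy');
title('Effect of MaxEpochs (small data set)');
ylim([0 1]);
```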

Part 5.2: Train the network-classifier with a larger number of images

Return to table of contents
% Variable To Be Explored: Investigate increasing MaxEpochs from 5 to 10 to 20
% Variable To Be Explored: Use large, UN-augmented data set
% All other variables held constant
% Use established "layers" modified pretrained alexnet network
 
epoch_model_large_data_accuracies = [];
epoch_models = ["5-epoch", "10-epoch", "20-epoch"];
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',5, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_large, ...
'Verbose',false, ...
'Plots','training-progress');
 
five_epoch_accuracy = model_optimization_explorer (trainingSet_large, validationSet_large, testSet_large, model_options, layers);
The validation accuracy is: 94.00 %
 
epoch_model_large_data_accuracies = [epoch_model_large_data_accuracies, five_epoch_accuracy]; % Capture accuracy value
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_large, ...
'Verbose',false, ...
'Plots','training-progress');
 
ten_epoch_accuracy = model_optimization_explorer (trainingSet_large, validationSet_large, testSet_large, model_options, layers);
The validation accuracy is: 92.67 %
 
epoch_model_large_data_accuracies = [epoch_model_large_data_accuracies, ten_epoch_accuracy]; % Capture accuracy value
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',20, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_large, ...
'Verbose',false, ...
'Plots','training-progress');
 
twenty_epoch_accuracy = model_optimization_explorer (trainingSet_large, validationSet_large, testSet_large, model_options, layers);
The validation accuracy is: 94.00 %
 
epoch_model_large_data_accuracies = [epoch_model_large_data_accuracies twenty_epoch_accuracy]; % Capture accuracy value
 
disp(epoch_models)
"5-epoch" "10-epoch" "20-epoch"
disp(epoch_model_large_data_accuracies)
0.9400 0.9267 0.9400
% Convert arrays to a table
dataframe_epochs_large_data = table(epoch_models', epoch_model_small_data_accuracies', epoch_model_large_data_accuracies' ...
, 'VariableNames', {'model', 'accuracy (small data)', 'accuracy (large data)'});
 
% Display the dataframe
disp(dataframe_epochs_large_data);
       model        accuracy (small data)    accuracy (large data)
    ____________    _____________________    _____________________

    "5-epoch"               1                       0.94
    "10-epoch"              1                    0.92667
    "20-epoch"        0.83333                       0.94
-- Conclusions/Lessons/Insights:
With the larger data set, accuracy did not decrease as the number of training epochs grew past 10 toward 20.
With the smaller data set, accuracy decreased as the number of training epochs grew past 10 toward 20.
For both large and small data sets, between 5 and 10 training epochs may be sufficient.
Strategy => Use no more than 10 training epochs going forward.
Strategy => Use the larger data set.

Part 5.3: Try different (%) values for partitioning the dataset into training and validation sets

Return to table of contents
% Variable To Be Explored: Various (%) values for partitioning the dataset into training, validation, and test sets
% Variable Selected/Locked After Testing: MaxEpochs held at 10
% Variable Selected/Locked After Testing: Use large, UN-augmented data set
% All other variables held constant
% Use established "layers" modified pretrained alexnet network

Load large data set and prepare various data set partitions

% Path to large data set
dataFolder_larger = './larger_data/larger_PetImages';
% Category names of large data set
categories = {'cat', 'dog'};
% Creates an imageDatastore object holding large data set
imds_large = imageDatastore(fullfile(dataFolder_larger, categories), LabelSource = 'foldernames');
% Counts the number of images in each category of imageDatastore object
tbl_large = countEachLabel(imds_large);
% Display images in each category of imageDatastore object as a table
disp (tbl_large)
    Label    Count
    _____    _____

     cat      500
     dog      500
% Calculate the minimum number of images among all classes
minSetCount_large = min(tbl_large{:,2});
% Split the image datastore (imds) into two new datastores,
% each containing the same number of images as specified by minSetCount.
% This function ensures that each class has an equal number of random images.
imds_large = splitEachLabel(imds_large, minSetCount_large, 'randomized');
% Verify each set now has exactly the same number of images.
countEachLabel(imds_large)
ans = 2×2 table
         Label    Count
    1    cat       500
    2    dog       500
 
% External function to resize all images in the large data set to 227 x 227 RGB for AlexNet
image_size = 227;
imds_large.ReadFcn = @(filename)readAndPreprocessImage(filename, image_size);
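The helper readAndPreprocessImage is an external function not shown in this script. A minimal sketch of what it is assumed to do (read an image, replicate grayscale across three channels, and resize to a square RGB image) might look like this; the actual file may differ:

```matlab
function img = readAndPreprocessImage(filename, image_size)
% Hypothetical sketch of the external helper -- the real implementation
% may differ. Reads an image and returns it as image_size x image_size x 3.
img = imread(filename);
if ismatrix(img)
    % Replicate grayscale images across three channels for the CNN input
    img = cat(3, img, img, img);
end
img = imresize(img, [image_size image_size]);
end
```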

Divide large data set into 60/20/20 training, validation, and test sets

% split data into 60 % for training set, 20 % for validation set, 20 % for test set
[trainingSet_large_60_20_20, remainingSet_large_60_20_20] = splitEachLabel(imds_large, 0.6, 'randomized');
[testSet_large_60_20_20, validationSet_large_60_20_20] = splitEachLabel(remainingSet_large_60_20_20, 0.5, 'randomized');
% display number of training set images
countEachLabel(trainingSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       300
    2    dog       300
% display number of validation set images
countEachLabel(validationSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       100
    2    dog       100
% display number of test set images
countEachLabel(testSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       100
    2    dog       100

Divide large data set into 70/15/15 training, validation, and test sets

% split data into 70 % for training set, 15 % for validation set, 15 % for test set
[trainingSet_large_70_15_15, remainingSet_large_70_15_15] = splitEachLabel(imds_large, 0.7, 'randomized');
[testSet_large_70_15_15, validationSet_large_70_15_15] = splitEachLabel(remainingSet_large_70_15_15, 0.5, 'randomized');
% display number of training set images
countEachLabel(trainingSet_large_70_15_15)
ans = 2×2 table
         Label    Count
    1    cat       350
    2    dog       350
% display number of validation set images
countEachLabel(validationSet_large_70_15_15)
ans = 2×2 table
         Label    Count
    1    cat       75
    2    dog       75
% display number of test set images
countEachLabel(testSet_large_70_15_15)
ans = 2×2 table
         Label    Count
    1    cat       75
    2    dog       75

Divide large data set into 80/10/10 training, validation, and test sets

% split data into 80 % for training set, 10 % for validation set, 10 % for test set
[trainingSet_large_80_10_10, remainingSet_large_80_10_10] = splitEachLabel(imds_large, 0.8, 'randomized');
[testSet_large_80_10_10, validationSet_large_80_10_10] = splitEachLabel(remainingSet_large_80_10_10, 0.5, 'randomized');
% display number of training set images
countEachLabel(trainingSet_large_80_10_10)
ans = 2×2 table
         Label    Count
    1    cat       400
    2    dog       400
% display number of validation set images
countEachLabel(validationSet_large_80_10_10)
ans = 2×2 table
         Label    Count
    1    cat       50
    2    dog       50
% display number of test set images
countEachLabel(testSet_large_80_10_10)
ans = 2×2 table
         Label    Count
    1    cat       50
    2    dog       50

% Variable To Be Explored: Various (%) values for partitioning the dataset into training, validation, and test sets
% Variable Selected/Locked After Testing: MaxEpochs held at 10
% Variable Selected/Locked After Testing: Use large, UN-augmented data set
% All other variables held constant
% Use established "layers" modified pretrained alexnet network
 
split_models_accuracies = [];
split_models = ["60/20/20-split", "70/15/15-split", "80/10/10-split"];
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_60_20_20 = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 96.50 %
 
split_models_accuracies = [split_models_accuracies, accuracy_60_20_20]; % Capture accuracy value
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_large_70_15_15, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_70_15_15 = model_optimization_explorer (trainingSet_large_70_15_15, validationSet_large_70_15_15, testSet_large_70_15_15, model_options, layers);
The validation accuracy is: 94.00 %
split_models_accuracies = [split_models_accuracies, accuracy_70_15_15]; % Capture accuracy value
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_large_80_10_10, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_80_10_10 = model_optimization_explorer (trainingSet_large_80_10_10, validationSet_large_80_10_10, testSet_large_80_10_10, model_options, layers);
The validation accuracy is: 96.00 %
 
split_models_accuracies = [split_models_accuracies, accuracy_80_10_10]; % Capture accuracy value
 
disp(split_models)
"60/20/20-split" "70/15/15-split" "80/10/10-split"
disp(split_models_accuracies)
0.9650 0.9400 0.9600
 
% Convert arrays to a table
dataframe_split_models = table(split_models', split_models_accuracies', 'VariableNames', {'model', 'accuracy'});
 
% Display the dataframe
disp(dataframe_split_models);
         model          accuracy
    ________________    ________
    "60/20/20-split"     0.965
    "70/15/15-split"     0.94
    "80/10/10-split"     0.96
-- Conclusions/Lessons/Insights:
The choice of data split had little impact on model accuracy (range: 0.94 to 0.965).
However, if a best split had to be chosen, the 60/20/20 split performed best.
Strategy => Split the data set 60/20/20 into training / validation / test sets.
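The three split experiments above repeat the same two-step pattern; a small helper (hypothetical, not part of the original runs) could produce any training/validation/test partition from a single training fraction:

```matlab
function [trainSet, valSet, testSet] = make_split(imds, trainFrac)
% Hypothetical helper (not part of the original runs): split an
% imageDatastore into training, validation, and test sets, dividing the
% remainder after the training split evenly between validation and test.
% trainFrac = 0.6, 0.7, 0.8 reproduces the 60/20/20, 70/15/15, and
% 80/10/10 partitions used above.
[trainSet, remainder] = splitEachLabel(imds, trainFrac, 'randomized');
[testSet, valSet] = splitEachLabel(remainder, 0.5, 'randomized');
end
```

For example, `[trainingSet, validationSet, testSet] = make_split(imds_large, 0.6);` would recreate the 60/20/20 partition.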

Part 5.4: Choose different hyperparameters (optimizer, learning rate, etc.) for network training

Return to table of contents

-- optimizer

% Variable To Be Explored: optimizer 'sgdm' versus optimizer 'adam'
% Variable Selected/Locked After Testing: 60 / 20 / 20 <-------> training set / validation set / test set split.
% Variable Selected/Locked After Testing: MaxEpochs held at 10
% Variable Selected/Locked After Testing: Use large, UN-augmented data set
% All other variables held constant
% Use established "layers" modified pretrained alexnet network
 
optimizer_models_accuracies = [];
optimizer_models = ["sgdm", "adam"];
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_60_20_20_sgdm = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 96.00 %
 
optimizer_models_accuracies = [optimizer_models_accuracies, accuracy_60_20_20_sgdm]; % Capture accuracy value
 
model_options = trainingOptions('adam', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',1e-4, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_60_20_20_adam = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 93.00 %
 
 
optimizer_models_accuracies = [optimizer_models_accuracies, accuracy_60_20_20_adam]; % Capture accuracy value
 
disp(optimizer_models)
"sgdm" "adam"
disp(optimizer_models_accuracies)
0.9600 0.9300
 
% Convert arrays to a table
dataframe_optimizer_models = table(optimizer_models', optimizer_models_accuracies', 'VariableNames', {'model', 'accuracy'});
 
% Display the dataframe
disp(dataframe_optimizer_models);
    model     accuracy
    ______    ________
    "sgdm"      0.96
    "adam"      0.93
-- Conclusions/Lessons/Insights:
The sgdm optimizer performed slightly better than the adam optimizer (0.96 versus 0.93).
The sgdm optimizer also trained a few minutes faster than the adam optimizer.
Strategy => Use the sgdm optimizer.

-- learning rate

% Variable To Be Explored: Try several learning rates (LR) -- 0.00055, 0.0001*, 0.000055, 0.00001
% Original Learning Rate: 0.0001* or 1e-4*
% Variable Selected/Locked After Testing: 60 / 20 / 20 <-------> training set / validation set / test set split.
% Variable Selected/Locked After Testing: MaxEpochs held at 10
% Variable Selected/Locked After Testing: Use large, UN-augmented data set
% Variable Selected/Locked After Testing: Use optimizer 'sgdm'
% All other variables held constant
% Use established "layers" modified pretrained alexnet network
 
LR_models_accuracies = [];
LR_models = ["LR = 0.00055", "LR = 0.0001", "LR = 0.000055", "LR = 0.00001"];
 
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.00055, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_LR_0_00055 = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 50.00 %
 
LR_models_accuracies = [LR_models_accuracies, accuracy_LR_0_00055]; % Capture accuracy value
 
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_LR_0_0001 = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 96.50 %
 
LR_models_accuracies = [LR_models_accuracies, accuracy_LR_0_0001]; % Capture accuracy value
 
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.000055, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_LR_0_000055 = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 96.00 %
 
LR_models_accuracies = [LR_models_accuracies, accuracy_LR_0_000055]; % Capture accuracy value
 
 
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.00001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_LR_0_00001 = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 96.50 %
 
LR_models_accuracies = [LR_models_accuracies, accuracy_LR_0_00001]; % Capture accuracy value
 
 
disp(LR_models)
"LR = 0.00055" "LR = 0.0001" "LR = 0.000055" "LR = 0.00001"
disp(LR_models_accuracies)
0.5000 0.9650 0.9600 0.9650
 
% Convert arrays to a table
dataframe_LR_models = table(LR_models', LR_models_accuracies', 'VariableNames', {'model', 'accuracy'});
 
% Display the dataframe
disp(dataframe_LR_models);
         model         accuracy
    _______________    ________
    "LR = 0.00055"       0.5
    "LR = 0.0001"        0.965
    "LR = 0.000055"      0.96
    "LR = 0.00001"       0.965
-- Conclusions/Lessons/Insights:
The default learning rate is LR = 0.0001 (1e-4).
Model training is very sensitive to learning rates that are too large.
Learning rates larger than the default of 0.0001 resulted either in the model failing to train (shown above for LR = 0.00055, which stayed at 0.5 accuracy, i.e., chance level for two classes) or in the model crashing** during training (explored but not shown).
Smaller learning rates (LR = 0.000055 and LR = 0.00001) did not improve overall model accuracy, which hovered around 96%.
Strategy => Use the default learning rate of 0.0001.
** If a crash occurs during a script rerun, simply rerunning this section may resolve it. Crashes for learning rates larger than 0.0001 (e.g., LR = 0.00055) occurred about half the time.
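The four learning-rate runs above differ only in 'InitialLearnRate'; if the sweep were rerun, the repetition could be collapsed into a loop. A sketch reusing the datastores and "layers" defined above ('Plots' is set to 'none' here to avoid opening one progress window per run):

```matlab
% Sketch: sweep the learning rates in a loop instead of repeating blocks.
learning_rates = [0.00055, 0.0001, 0.000055, 0.00001];
LR_models_accuracies = zeros(1, numel(learning_rates));
for k = 1:numel(learning_rates)
    model_options = trainingOptions('sgdm', ...
        'MiniBatchSize', 10, ...
        'MaxEpochs', 10, ...
        'InitialLearnRate', learning_rates(k), ...
        'ValidationData', validationSet_large_60_20_20, ...
        'Verbose', false, ...
        'Plots', 'none');
    LR_models_accuracies(k) = model_optimization_explorer( ...
        trainingSet_large_60_20_20, validationSet_large_60_20_20, ...
        testSet_large_60_20_20, model_options, layers);
end
```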

Part 5.5: Try different image data augmentation options (and values/ranges)

Return to table of contents
% Variable To Be Explored: Impact of augmented training data set versus non-augmented training data set
% Variable To Be Explored: Impact of different pixel ranges and scale ranges in augmented training data set
% Variable Selected/Locked After Testing: Use default learning rate of 0.0001
% Variable Selected/Locked After Testing: 60 / 20 / 20 <-------> training set / validation set / test set split.
% Variable Selected/Locked After Testing: MaxEpochs held at 10
% Variable Selected/Locked After Testing: Use large data set
% Variable Selected/Locked After Testing: Use optimizer 'sgdm'
% All other variables held constant
% Use established "layers" modified pretrained alexnet network

Define augmented training, validation, and test sets

% Data augmentation: baseline
pixelRange1 = [-30 30];
scaleRange1 = [0.9 1.1];
imageAugmenter1 = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange1, ...
'RandYTranslation',pixelRange1, ...
'RandXScale',scaleRange1, ...
'RandYScale',scaleRange1);
 
 
% Data augmentation: increased pixel range
pixelRange2 = [-40 40];
scaleRange2 = [0.9 1.1];
imageAugmenter2 = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange2, ...
'RandYTranslation',pixelRange2, ...
'RandXScale',scaleRange2, ...
'RandYScale',scaleRange2);
 
 
% Data augmentation: decreased pixel range
pixelRange3 = [-20 20];
scaleRange3 = [0.9 1.1];
imageAugmenter3 = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange3, ...
'RandYTranslation',pixelRange3, ...
'RandXScale',scaleRange3, ...
'RandYScale',scaleRange3);
 
 
% Data augmentation: increased scale range
pixelRange4 = [-30 30];
scaleRange4 = [0.85 1.15];
imageAugmenter4 = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange4, ...
'RandYTranslation',pixelRange4, ...
'RandXScale',scaleRange4, ...
'RandYScale',scaleRange4);
 
 
% Data augmentation: decreased scale range
pixelRange5 = [-30 30];
scaleRange5 = [0.95 1.05];
imageAugmenter5 = imageDataAugmenter( ...
'RandXReflection',true, ...
'RandXTranslation',pixelRange5, ...
'RandYTranslation',pixelRange5, ...
'RandXScale',scaleRange5, ...
'RandYScale',scaleRange5);
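Before training, each augmenter's effect can be sanity-checked by applying it repeatedly to one sample image with augment (a quick visual check, not part of the measured runs):

```matlab
% Sanity check: show eight random transforms of one image under
% imageAugmenter1 (each call to augment draws new random parameters).
sample_img = readimage(imds_large, 1);
figure;
for i = 1:8
    subplot(2, 4, i);
    imshow(augment(imageAugmenter1, sample_img));
end
```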
 
 
 
% Define augmented training and validation data sets
% retrieve the input size of the first layer of the pretrained model
inputSize = model1.Layers(1).InputSize;
 
% Create an augmented image datastore data set 60/20/20 splits
% Only the training set will undergo the 5 different augmentations defined above
% Training sets
augimdsTrain_large_60_20_20_ag1 = augmentedImageDatastore(inputSize(1:2),trainingSet_large_60_20_20, DataAugmentation = imageAugmenter1);
augimdsTrain_large_60_20_20_ag2 = augmentedImageDatastore(inputSize(1:2),trainingSet_large_60_20_20, DataAugmentation = imageAugmenter2);
augimdsTrain_large_60_20_20_ag3 = augmentedImageDatastore(inputSize(1:2),trainingSet_large_60_20_20, DataAugmentation = imageAugmenter3);
augimdsTrain_large_60_20_20_ag4 = augmentedImageDatastore(inputSize(1:2),trainingSet_large_60_20_20, DataAugmentation = imageAugmenter4);
augimdsTrain_large_60_20_20_ag5 = augmentedImageDatastore(inputSize(1:2),trainingSet_large_60_20_20, DataAugmentation = imageAugmenter5);
% Validation set
validationSet_large_60_20_20;
% Test set
testSet_large_60_20_20;
 
 
 
 
augmentation_models_accuracies = [];
augmentation_models = ["No Augmentation", "Augmentation_1: baseline", ...
"Augmentation_2: inc pixel range", "Augmentation_3: dec pixel range", ...
"Augmentation_4: inc scale", "Augmentation_5: dec scale"];
 
% No data augmentation
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_no_aug = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 97.00 %
 
augmentation_models_accuracies = [augmentation_models_accuracies, accuracy_no_aug]; % Capture accuracy value
 
% Data augmentation: baseline
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_ag1 = model_optimization_explorer (augimdsTrain_large_60_20_20_ag1, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 93.50 %
 
augmentation_models_accuracies = [augmentation_models_accuracies, accuracy_ag1]; % Capture accuracy value
 
% Data augmentation: increased pixel range
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_ag2 = model_optimization_explorer (augimdsTrain_large_60_20_20_ag2, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 97.00 %
 
augmentation_models_accuracies = [augmentation_models_accuracies, accuracy_ag2]; % Capture accuracy value
 
% Data augmentation: decreased pixel range
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_ag3 = model_optimization_explorer (augimdsTrain_large_60_20_20_ag3, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 93.50 %
 
augmentation_models_accuracies = [augmentation_models_accuracies, accuracy_ag3]; % Capture accuracy value
 
% Data augmentation: increased scale range
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_ag4 = model_optimization_explorer (augimdsTrain_large_60_20_20_ag4, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 96.00 %
 
augmentation_models_accuracies = [augmentation_models_accuracies, accuracy_ag4]; % Capture accuracy value
 
% Data augmentation: decreased scale range
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_ag5 = model_optimization_explorer (augimdsTrain_large_60_20_20_ag5, validationSet_large_60_20_20, testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 96.00 %
 
augmentation_models_accuracies = [augmentation_models_accuracies, accuracy_ag5]; % Capture accuracy value
 
 
disp(augmentation_models)
"No Augmentation" "Augmentation_1: baseline" "Augmentation_2: inc pixel range" "Augmentation_3: dec pixel range" "Augmentation_4: inc scale" "Augmentation_5: dec scale"
disp(augmentation_models_accuracies)
0.9700 0.9350 0.9700 0.9350 0.9600 0.9600
% Convert arrays to a table
augmentation_models = ["No Augmentation", "Augmentation_1: baseline", ...
"Augmentation_2: inc pixel range", "Augmentation_3: dec pixel range", ...
"Augmentation_4: inc scale", "Augmentation_5: dec scale"];
dataframe_augmentation_models = table(augmentation_models', augmentation_models_accuracies', ...
'VariableNames', {'model', 'accuracy'});
 
% Display the dataframe
disp(dataframe_augmentation_models);
                  model                  accuracy
    _________________________________    ________
    "No Augmentation"                      0.97
    "Augmentation_1: baseline"             0.935
    "Augmentation_2: inc pixel range"      0.97
    "Augmentation_3: dec pixel range"      0.935
    "Augmentation_4: inc scale"            0.96
    "Augmentation_5: dec scale"            0.96
-- Conclusions/Lessons/Insights:
The two best accuracies (0.97) came from the model trained on unaugmented data and the model trained with the increased pixel range augmentation.
The other augmentation settings yielded lower accuracies.
If augmentation must be used, the increased pixel range setting should be selected.
These results do not justify the extra time and effort needed to augment this training data set.
Strategy => Use the UN-augmented, large data set with the 60/20/20 training/validation/test split.

Part 5.6: Use a different pretrained model

Return to table of contents
% Variable To Be Explored: Compare the "layers" pretrained AlexNet network to another pretrained network
% Variable Selected/Locked After Testing: 60 / 20 / 20 <-------> training set / validation set / test set split.
% Variable Selected/Locked After Testing: MaxEpochs held at 10
% Variable Selected/Locked After Testing: Use large, UN-augmented data set
% Variable Selected/Locked After Testing: Use optimizer 'sgdm'
% All other variables held constant
 
pretrained_models_accuracies = [];
pretrained_models = ["Pretrained AlexNet", "Pretrained GoogleNet"];
 

Load and size large data set for AlexNet

% Path to large data set
dataFolder_larger = './larger_data/larger_PetImages';
% Category names of large data set
categories = {'cat', 'dog'};
% Creates an imageDatastore object holding large data set
imds_large = imageDatastore(fullfile(dataFolder_larger, categories), LabelSource = 'foldernames');
% Counts the number of images in each category of imageDatastore object
tbl_large = countEachLabel(imds_large);
% Display images in each category of imageDatastore object as a table
disp (tbl_large)
    Label    Count
    _____    _____
    cat       500
    dog       500
% Calculate the minimum number of images among all classes
minSetCount_large = min(tbl_large{:,2});
% Split the image datastore (imds) into two new datastores,
% each containing the same number of images as specified by minSetCount.
% This function ensures that each class has an equal number of random images.
imds_large = splitEachLabel(imds_large, minSetCount_large, 'randomized');
% Verify each set now has exactly the same number of images.
countEachLabel(imds_large)
ans = 2×2 table
         Label    Count
    1    cat       500
    2    dog       500
 
% External function to resize all images in the large data set to 227 x 227 RGB for AlexNet
image_size = 227;
imds_large.ReadFcn = @(filename)readAndPreprocessImage(filename, image_size);

Divide large data set into 60/20/20 training, validation, and test sets for AlexNet

% split data into 60 % for training set, 20 % for validation set, 20 % for test set
[trainingSet_large_60_20_20, remainingSet_large_60_20_20] = splitEachLabel(imds_large, 0.6, 'randomized');
[testSet_large_60_20_20, validationSet_large_60_20_20] = splitEachLabel(remainingSet_large_60_20_20, 0.5, 'randomized');
% display number of training set images
countEachLabel(trainingSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       300
    2    dog       300
% display number of validation set images
countEachLabel(validationSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       100
    2    dog       100
% display number of test set images
countEachLabel(testSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       100
    2    dog       100
 
% Use pretrained alexnet "layers"
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_alexnet = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, ...
testSet_large_60_20_20, model_options, layers);
The validation accuracy is: 93.50 %
 
pretrained_models_accuracies = [pretrained_models_accuracies, accuracy_alexnet]; % Capture accuracy value
 
 
 

Reload and resize large data set for GoogleNet

% Path to large data set
dataFolder_larger = './larger_data/larger_PetImages';
% Category names of large data set
categories = {'cat', 'dog'};
% Creates an imageDatastore object holding large data set
imds_large = imageDatastore(fullfile(dataFolder_larger, categories), LabelSource = 'foldernames');
% Counts the number of images in each category of imageDatastore object
tbl_large = countEachLabel(imds_large);
% Display images in each category of imageDatastore object as a table
disp (tbl_large)
    Label    Count
    _____    _____
    cat       500
    dog       500
% Calculate the minimum number of images among all classes
minSetCount_large = min(tbl_large{:,2});
% Split the image datastore (imds) into two new datastores,
% each containing the same number of images as specified by minSetCount.
% This function ensures that each class has an equal number of random images.
imds_large = splitEachLabel(imds_large, minSetCount_large, 'randomized');
% Verify each set now has exactly the same number of images.
countEachLabel(imds_large)
ans = 2×2 table
         Label    Count
    1    cat       500
    2    dog       500
 
% External function to resize all images in the large data set to 224 x 224 RGB for GoogleNet
image_size = 224;
imds_large.ReadFcn = @(filename)readAndPreprocessImage(filename, image_size);

Re-divide large data set into 60/20/20 training, validation, and test sets for GoogleNet

% split data into 60 % for training set, 20 % for validation set, 20 % for test set
[trainingSet_large_60_20_20, remainingSet_large_60_20_20] = splitEachLabel(imds_large, 0.6, 'randomized');
[testSet_large_60_20_20, validationSet_large_60_20_20] = splitEachLabel(remainingSet_large_60_20_20, 0.5, 'randomized');
% display number of training set images
countEachLabel(trainingSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       300
    2    dog       300
% display number of validation set images
countEachLabel(validationSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       100
    2    dog       100
% display number of test set images
countEachLabel(testSet_large_60_20_20)
ans = 2×2 table
         Label    Count
    1    cat       100
    2    dog       100
 
% Use pretrained googlenet "goog_net"
goog_net = googlenet;
 
% Modify the last 3 layers of goog_net for specific classification task
%--Show current last three layers of goog_net
current_layers = goog_net.Layers(end-2:end)
current_layers =
3×1 Layer array with layers:
     1   'loss3-classifier'   Fully Connected         1000 fully connected layer
     2   'prob'               Softmax                 softmax
     3   'output'             Classification Output   crossentropyex with 'tench' and 999 other classes
%--Define needed last three layers of goog_net
new_layers = [
fullyConnectedLayer(number_of_classes,'Name','fc','WeightLearnRateFactor',10,'BiasLearnRateFactor',10)
softmaxLayer('Name','softmax')
classificationLayer('Name','output')];
%--Show needed last three layers of goog_net
new_layers
new_layers =
3×1 Layer array with layers:
     1   'fc'        Fully Connected         2 fully connected layer
     2   'softmax'   Softmax                 softmax
     3   'output'    Classification Output   crossentropyex
%--Replace the last layers in goog_net
goog_net = layerGraph(goog_net);
goog_net = replaceLayer(goog_net,'loss3-classifier',new_layers(1));
goog_net = replaceLayer(goog_net,'prob',new_layers(2));
goog_net = replaceLayer(goog_net,'output',new_layers(3));
%--Show newly replaced last three layers of goog_net
current_layers = goog_net.Layers(end-2:end)
current_layers =
3×1 Layer array with layers:
     1   'fc'        Fully Connected         2 fully connected layer
     2   'softmax'   Softmax                 softmax
     3   'output'    Classification Output   crossentropyex
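After replacing the last three layers, the modified layer graph can also be verified before training (an optional check, not part of the original run):

```matlab
% Optional sanity check: open the Network Analyzer to confirm the replaced
% layers are connected and the final fully connected layer outputs 2 classes.
analyzeNetwork(goog_net);
```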
 
% Use pretrained googlenet "goog_net"
model_options = trainingOptions('sgdm', ...
'MiniBatchSize',10, ...
'MaxEpochs',10, ...
'InitialLearnRate',0.0001, ...
'ValidationData',validationSet_large_60_20_20, ...
'Verbose',false, ...
'Plots','training-progress');
 
accuracy_googlenet = model_optimization_explorer (trainingSet_large_60_20_20, validationSet_large_60_20_20, ...
testSet_large_60_20_20, model_options, goog_net);
The validation accuracy is: 99.00 %
 
pretrained_models_accuracies = [pretrained_models_accuracies, accuracy_googlenet]; % Capture accuracy value
disp(pretrained_models)
"Pretrained AlexNet" "Pretrained GoogleNet"
disp(pretrained_models_accuracies)
0.9350 0.9900
% Convert arrays to a table
 
dataframe_pretrained_models = table(pretrained_models', pretrained_models_accuracies', ...
'VariableNames', {'model', 'accuracy'});
 
% Display the dataframe
disp(dataframe_pretrained_models);
model accuracy ______________________ ________ "Pretrained AlexNet" 0.935 "Pretrained GoogleNet" 0.99
-- Conclusions/Lessons/Insights:
The GoogleNet model achieved higher accuracy than the AlexNet model (99% versus 93.5%, respectively, as shown above).
The higher GoogleNet accuracy required more training time (about 30 minutes) compared to AlexNet (about 20 minutes).
This seems reasonable, since GoogleNet is the more complex network; the accuracy-versus-complexity relationship is illustrated at the source below.
Source: https://www.mathworks.com/solutions/deep-learning/models.html
Strategy => If the longer training time fits within cost/time constraints and the additional ~5% accuracy is mission critical, the GoogleNet model should be used instead of the AlexNet model.

FINAL CONCLUSIONS:

-- About 10 training epochs is enough.

-- A larger image set of a few hundred images per class should be used if possible.

-- The 60/20/20 partition for training/validation/test sets seems best.

-- The best optimizer is 'sgdm'.

-- The default training rate of 0.0001 should be used.

-- Data augmentation is not always helpful and, if used, should focus on slightly increased pixel range.

-- Using a different or more complex pretrained network can yield higher accuracies, but this must be weighed against the increased training time of more complex models.

Functions

Return to table of contents
A function to explore the impact of changing various model training variables
The function returns the accuracy on the held-out test set, displays a heatmap confusion matrix, and displays any misclassified images
function [prediction_accuracy] = model_optimization_explorer (training_data, validation_data, test_data, model_options, network_layers)
% Train model
predictive_model = trainNetwork(training_data, network_layers, model_options);
 
% Show confusion matrix
 
% predicted_labels holds the predicted class label for each image in test_data.
% label_probabilities contains the per-class scores for each image:
% one row per image, one column per class, with each value giving the
% model's confidence (probability) for that class.
[predicted_labels,label_probabilities] = classify(predictive_model,test_data);
true_labels = test_data.Labels; % actual labels of the test image set
% Compute the fraction of correctly predicted labels
% by comparing predicted versus actual test image set labels
prediction_accuracy = mean(predicted_labels == true_labels);
% Note: the message below says "validation accuracy" to match the outputs
% shown earlier, but the value is computed on the held-out test set.
fprintf("The validation accuracy is: %.2f %%\n", prediction_accuracy * 100);
% Define class labels
class_names = unique([true_labels; predicted_labels]);
% Compute the confusion matrix
cf_matrix = confusionmat(true_labels, predicted_labels, 'Order', class_names);
% Visualize the confusion matrix as a heatmap
figure;
heatmap(class_names, class_names, cf_matrix);
title('Confusion Matrix');
xlabel('Predicted Labels');
ylabel('True Labels');
 
% Inspect incorrect classifications
 
% Find indices where the true labels and predicted labels do not match
incorrect_indices = find(true_labels ~= predicted_labels);
if isempty(incorrect_indices)
disp("No images misclassified")
else
figure;
% Display sample of the incorrectly classified images
image_sample = min(20, numel(incorrect_indices)); % Show at most 20 images
for i = 1:image_sample
subplot(4, 5, i);
idx = incorrect_indices(i);
% Extract the file path from the test datastore
% (incorrect_indices refer to images in test_data, not validation_data)
file_path_cell = test_data.Files(idx);
% Convert the cell array into a character vector
file_path_string = char(file_path_cell);
% Show image with true and predicted labels
img = imread(file_path_string);
img_resized = imresize(img, [400, 600]);
imshow(img_resized)
title(sprintf('True: %s, \n Predicted: %s', string(true_labels(idx)), string(predicted_labels(idx))));
end
end
end